Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

storage: randomly mangle range key buffers in test #97222

Merged
merged 1 commit into from
Feb 27, 2023

Conversation

jbowens
Copy link
Collaborator

@jbowens jbowens commented Feb 15, 2023

Randomly mangle buffers for range key bounds, suffixes and values when the
iterator becomes invalid or RangeKeyChanged().

Additionally, fix a bug in pebbleMVCCScanner surfaced by this mangling, whereby
the scanner could step an iterator positioned at a bare range key into an
exhausted position but continue to make use of pointers to the previous
iterator position's range key buffers.

Epic: None
Release note: None

@jbowens jbowens requested a review from a team as a code owner February 15, 2023 22:52
@jbowens jbowens requested a review from bananabrick February 15, 2023 22:52
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@jbowens jbowens requested a review from erikgrinaker February 15, 2023 22:52
Copy link
Contributor

@erikgrinaker erikgrinaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks.

// Stepping might move the iterator entirely off the bare range key, at
// which point the previous curRangeKeys buffers are invalidated.
p.curRangeKeys.CloneInto(&p.savedRangeKeys)
p.curRangeKeys = p.savedRangeKeys
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would only happen for an exhausted iterator, since we'd otherwise loop around and detect this via RangeKeyChanged(). Why are we keeping the range keys around in this case? I think the right fix would be to set p.curRangeKeys = MVCCRangeKeyStack{} in the if !p.iterValid() branch below.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we keeping the range keys around in this case?

In tombstone mode we still care that there was a MVCC range tombstone, even if there's no point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, for get(), right. The implementation there is a bit hacky, there's probably a better way to do it. I think we should try to avoid doing this unnecessary copying in the general case for scans. Would be happy to have a look at it when I have a chance, if this isn't urgent.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making copies only in tombstone mode seems ok, we don't use that very often. I'll see if I can improve this separately.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submitted an alternative in #97327 which I think is slightly cleaner and only pays the cost in the one specific case that we need it (a tombstone point get).

Copy link
Contributor

@erikgrinaker erikgrinaker Feb 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, #97327 won't work, we'll have to do something like what you're doing here. We can consider adding an isGet field (I think we used to have one?) and only do this in that case, but there aren't currently any callers that set MVCCScanOptions.Tombstones so it isn't particularly important right now.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @jbowens)


pkg/storage/pebbleiter/crdb_test_on.go line 58 at r2 (raw file):

	rangeKeyBufs struct {
		idx   int
		start [2][]byte

[nit] Things would be cleaner if we had an enclosing struct for these:

rangeKeyBuf struct {
  start []byte
  end []byte
  keys []byte
}
rangeKeyBufs struct {
  idx int
  bufs [2]rangeKeyBuf
}

Then in the code you just do buf := &rangeKeyBufs.bufs[rangeKeyBufs.idx] and you don't have to worry about idx again.


pkg/storage/pebbleiter/crdb_test_on.go line 237 at r2 (raw file):

	if rand.Intn(2) == 0 {
		idx := i.rangeKeyBufs.idx
		for _, b := range [...][]byte{i.rangeKeyBufs.start[idx], i.rangeKeyBufs.end[idx]} {

[nit] This looks pretty cryptic, maybe define a clear([]byte) func and call that without the loops.


func (i *assertionIter) RangeKeys() []pebble.RangeKeyData {
if !i.Valid() {
panic(errors.AssertionFailedf("RangeKeys() called on !Valid() pebble.Iterator"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't this valid? We document that at least RangeBounds() returns nothing when there are no range keys, and I'd expect RangeKeys() to do the same. Is there a compelling reason not to allow this?

// Stepping might move the iterator entirely off the bare range key, at
// which point the previous curRangeKeys buffers are invalidated.
p.curRangeKeys.CloneInto(&p.savedRangeKeys)
p.curRangeKeys = p.savedRangeKeys
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Submitted an alternative in #97327 which I think is slightly cleaner and only pays the cost in the one specific case that we need it (a tombstone point get).

Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's still some problems this revealed that I need to work out.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @erikgrinaker)


pkg/storage/pebbleiter/crdb_test_on.go line 119 at r2 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

Why isn't this valid? We document that at least RangeBounds() returns nothing when there are no range keys, and I'd expect RangeKeys() to do the same. Is there a compelling reason not to allow this?

I'm planning on restricting both RangeBounds and RangeKeys to require an already valid iterator with a range key. The validity checks are not trivial, including within Pebble and allowing this just duplicates them. It also makes an easier to misuse interface, because it's easy to rely on RangeBounds/RangeKeys without having ever checked if there was an iterator error.

@jbowens jbowens force-pushed the rangekey-mangle branch 8 times, most recently from bd682fb to 4de48fa Compare February 23, 2023 14:56
jbowens added a commit to jbowens/cockroach that referenced this pull request Feb 23, 2023
Backport cockroachdb#90859, cockroachdb#96685 and cockroachdb#97222 to mangle the buffers returned by Pebble
iterators to ensure MVCC and Cockroach code respects the Pebble iterator memory
lifetimes.

The RangeBounds() and RangeKeys() validity assertions are omitted, since 22.2
allowed these methods to be invoked on an invalid iterator or an iterator
positioned at a point key.

Release note: None
Release justification: Non-production code changes
Copy link
Collaborator Author

@jbowens jbowens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @bananabrick and @erikgrinaker)


pkg/storage/pebble_mvcc_scanner.go line 1528 at r1 (raw file):

Previously, erikgrinaker (Erik Grinaker) wrote…

Nevermind, #97327 won't work, we'll have to do something like what you're doing here. We can consider adding an isGet field (I think we used to have one?) and only do this in that case, but there aren't currently any callers that set MVCCScanOptions.Tombstones so it isn't particularly important right now.

I updated the PR to keep this copying local to get(). This avoids the copying for scans, and also fixes another problem the earlier processRangeKeys change did not fix: As a part of getOne, we might call seekVersion which can also invalidate the range keys, leaving garbage by the time get() synthesizes the tombstone.

I think this is ready for a look again, although I still need to produce a mvcc histories test case for the seekVersion case. I've had trouble reproducing it because it has some subtle requirements.


pkg/storage/pebbleiter/crdb_test_on.go line 58 at r2 (raw file):

Previously, RaduBerinde wrote…

[nit] Things would be cleaner if we had an enclosing struct for these:

rangeKeyBuf struct {
  start []byte
  end []byte
  keys []byte
}
rangeKeyBufs struct {
  idx int
  bufs [2]rangeKeyBuf
}

Then in the code you just do buf := &rangeKeyBufs.bufs[rangeKeyBufs.idx] and you don't have to worry about idx again.

Done.


pkg/storage/pebbleiter/crdb_test_on.go line 237 at r2 (raw file):

Previously, RaduBerinde wrote…

[nit] This looks pretty cryptic, maybe define a clear([]byte) func and call that without the loops.

Done.

Randomly mangle buffers for range key bounds, suffixes and values when the
iterator becomes invalid or RangeKeyChanged(). Additionally, fix two bugs
surfaced by this mangling:

The first bug was in pebbleMVCCScanner: The scanner could step an iterator
positioned at a bare range key into an exhausted position but continue to make
use of pointers to the previous iterator position's range key buffers.

The second bug was in CheckSSTConflicts: The extIter was repositioned without
saving or re-retrieving the range keys.

Epic: None
Release note: None
jbowens added a commit to jbowens/cockroach that referenced this pull request Feb 24, 2023
Backport cockroachdb#90859, cockroachdb#96685 and cockroachdb#97222 to mangle the buffers returned by Pebble
iterators to ensure MVCC and Cockroach code respects the Pebble iterator memory
lifetimes.

The RangeBounds() and RangeKeys() validity assertions are omitted, since 22.2
allowed these methods to be invoked on an invalid iterator or an iterator
positioned at a point key.

Release note: None
Release justification: Non-production code changes
jbowens added a commit to jbowens/cockroach that referenced this pull request Feb 24, 2023
Fix a bug whereby CheckSSTConflicts could seek the engine iterator and seek
back, assuming that the originally returned buffers were still valid. This is
only true if RangeKeyChanged()=false after both seeks.

Backport of part of cockroachdb#97222.

Epic: None
Release note: None
@jbowens
Copy link
Collaborator Author

jbowens commented Feb 27, 2023

TFTRs!

bors r=erikgrinaker

@craig
Copy link
Contributor

craig bot commented Feb 27, 2023

Build succeeded:

@craig craig bot merged commit 7730070 into cockroachdb:master Feb 27, 2023
@jbowens jbowens deleted the rangekey-mangle branch February 27, 2023 18:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants